Owing to its inherent complexity, active transport is a major hurdle for the prediction of oral absorption properties. We present a novel approach to calculating carrier-mediated drug absorption parameters, based on an entirely different paradigm than QSPR. We capitalize on the recently emerged idea that the activities of a molecule against a large protein set can be used to predict biological effects, and we performed large-scale numerical docking of drug-like compounds to a large, diversified set of proteins. As a result, we identified for the first time a protein whose binding affinity correlates well with the intestinal permeability of many actively absorbed compounds. Although the protein is not a transporter, we speculate that the force field of its binding site is similar to that of an important intestinal transporter. This observation allowed us to extend the passive absorption model with a non-linear flux associated with the transporting protein and thus obtain a quantitative model of active transport. The study demonstrates that binding data for a sufficiently representative set of proteins can serve as a basis for predicting the active absorption of a given compound.



I. INTRODUCTION 

The oral route of drug administration is very convenient for patients; however, it is often inefficient because of low solubility, low intestinal permeability, or a high first-pass effect. Prediction of oral absorption properties is therefore of great interest to the pharmaceutical industry. Orally administered drugs are mainly absorbed in the small intestine. Here, depending on drug composition and size, absorption can proceed through a variety of processes [35]. The drug passes through the epithelial cells and the lamina propria from the lumen into the blood stream in the capillaries. On its way it may be metabolized, transported away from the part of the tract where absorption is possible, or accumulate in organs other than those targeted by the treatment. Besides the fundamental interest in understanding the basic mechanisms by which a drug is assimilated by the human body, the kinetics of drug absorption is also of much practical interest. Detailed knowledge of this process, resulting in a prediction of the drug absorption profile, can be of great help at the drug development stage.

A number of kinetic absorption models have been developed that require the experimentally determined intestinal permeability of a compound as an input [73]. Although of great value, such "hybrid" models, partially experimental and partially computational, lack the main advantages of purely theoretical approaches: no need for chemical synthesis of the compound or for experimental facilities, low cost, and high speed. Among computational approaches that predict intestinal permeability solely from the structure of a molecule and its physicochemical properties, without any biological experimental data, there are two major directions: ab initio models and quantitative structure-property relationship (QSPR) models. The latter are overwhelmingly used nowadays and exploit a wide spectrum of statistical methods for the analysis of absorption data (see e.g. [?] for a review). Instead of relying on basic laws of nature, these models are trained on observed statistical regularities, which predetermines their limitations. In contrast to QSPR, a handful of studies develop models of intestinal permeability from first principles [?]. These models successfully describe the basic properties of passive absorption: dependence on the distribution coefficient, diffusional limitation at high LogD, and paracellular absorption. However, the major hindrance on this path is the complexity of intestinal absorption. Apart from passive phenomena (diffusion through the cell membrane and the paracellular junctions), there is also active transport of molecules into and out of the cells. To the best of our knowledge, current ab initio models are limited to the description of passive drug absorption. Most QSPR models also deal with passive transport [13], and only a few approaches go as far as developing QSPR models that describe both passive and carrier-mediated absorption mechanisms. However, carrier-mediated transport plays an important role in drug absorption [18] and hence demands the development of a good active absorption model.

The major objective of this investigation was to develop a novel approach to the prediction of carrier-mediated drug absorption, based on an entirely different paradigm than QSPR, thus avoiding its difficulties and enabling better predictions. It was recently observed that experimental values of molecular activities against a large protein set can be used to predict a broad spectrum of biological effects. In this study we took advantage of this concept and developed a novel quantitative method for identifying actively transported drugs. To do so, we performed a docking study of a few hundred small molecules (mostly drugs) against a diversified set of 400 proteins representing the human proteome. Using the available absorption data for each of the molecules, we identified a protein the affinity for which correlates well with the permeability of many actively absorbed compounds from our data set. This observation allowed us to extend the passive absorption model with non-linear fluxes associated with the transporting protein and thus also obtain a quantitative model of active transport.

The manuscript is organized as follows. After the standard Materials and Methods section, which outlines our approaches to data preparation, the docking study setup, and the data processing routines, we present a two-compartment model of drug absorption extended to include active transport via non-linear flux terms associated with transporting proteins. Once the model is built and the parameters of passive absorption are fitted to experimental data, we identify the active transport parameters to train the classifier. After the classification is set up, we compare our predictions with the available experimental information and thus validate the complete model for drug absorption prediction.

II. MATERIALS AND METHODS



A. Experimental absorption and permeability data 

Much experimental work aimed at analyzing the kinetic aspects of drug absorption has been carried out recently. For better control, a variety of in vitro methods for studying drug absorption have been developed [?]. One possibility is to seed (epithelial) cell cultures as a monolayer that forms the contact surface between two small chambers. The concentration of an applied drug can then be measured over time in both chambers. Two well-known cell culture models are Caco-2 cells [?] and MDCK cells [?].

To enrich the experimental data sets, we used two types of observed data to build the model: the fraction of drug absorbed after oral administration in humans (FA) and the permeability across a human colon adenocarcinoma cell (Caco-2) monolayer (P). The latter is a cell model routinely used in the pharmaceutical industry and academia to estimate drug absorption in the intestine [?, 10, 36, 65]. Previous findings showed a strong relationship between Caco-2 permeability and the fraction absorbed in humans (e.g. [?]), suggesting that one value can be used to estimate the other. From literature compilations we collected 91 observed FA values and 103 Caco-2 permeability values for 117 compounds that, to the best of our knowledge, are not subject to efflux from enterocytes [?]. Fig. 1 shows the FA values plotted against permeability for the compounds for which both values were available. The data were fitted with the sigmoid equation [?]:



$$\mathrm{FA} = \frac{100\%}{1 + (P/P_{50})^{p}} \qquad (1)$$

where P50 is the permeability at 50% FA and p is a slope factor. The fitted parameters were P50 = 7.94 × 10^-7 cm·s^-1 and p = -0.73, in reasonable agreement with the previously found P50 ≈ 2 × 10^-6 cm·s^-1 and p = -0.5 [46]. The fitted curve predicts FA = 90% for log P90 = -4.8 and FA = 10% for log P10 = -7.4, in reasonable agreement with log P90 = -5.3 and log P10 = -6.9 from [?]. The RMSD of the fit is fairly small, and thus Caco-2 permeability can indeed predict human intestinal absorption of orally taken drugs with reasonable accuracy. Fig. 1 shows two outliers, glycylsarcosine and amoxicillin, whose FA values were much higher than expected from their Caco-2 permeability. Glycylsarcosine and amoxicillin are carried through enterocyte membranes by PEPT transporters, which are reported to have reduced activity in Caco-2 cells [?]. This may account for the observed discrepancy between the measured Caco-2 permeability and the FA values.

Figure 1: The relationship between FA and Caco-2 permeability. The points correspond to experimental values for compounds with both FA and P known. The solid line is the approximation provided by Eq. (1). RMSD is 14%.

Eq. (1) can be used to estimate the missing values of FA and P for all the compounds in our compilation. However, Eq. (1) implies that P → ∞ as FA → 100%. Therefore, if the observed value of FA exceeded 97%, we assigned a P value of 4 × 10^-4 cm/s, corresponding to 97% FA.

The distribution coefficients, LogD (pH = 7.4), used throughout the research were either collected from the literature [?] or calculated using the Quantum software, version 3.3.0.


B. Preparation of the protein panel 

Our protein data set includes 400 proteins from the Protein Data Bank [68]. It covers almost all available cytoplasmic proteins with known X-ray structures and also includes some important transmembrane proteins such as ion channels and GPCRs. We use homology models for the GPCRs, since no experimentally determined structures are available [?].

Only proteins co-crystallized with a biologically active ligand were included in the data set. The ligands may be either natural ligands (such as a hormone for a hormone receptor or a substrate for an enzyme) or drugs, inhibitors, etc. If multiple files exist in the PDB repository for the same protein, we take the file with the most complete structure and/or the best resolution (the lowest value in Å).

Although the choice of proteins for the calculations is a very important step and the overall number of proteins is hardly manageable, we believe that the PDB archive contains a representative set of the most practically important proteins, covering the whole relevant variety of ligand-binding domains. Below we show that successful predictions do not require the presence of a specific ligand binder in the protein set employed for the calculations. Instead, it proves sufficient to have a structurally similar protein in the protein panel.



C. Docking setup and the binding constant, Kd, 
prediction. 



Typing of both the proteins and the small molecules, as well as the in silico screening, was carried out with the molecular processing and docking tools of the QUANTUM drug discovery software suite [?]. The software predicts the binding affinities of small molecules to resolved protein targets using a set of first-principles-based molecular simulations with an advanced continuous water model [?]. The approach provides the logarithmic value of the binding constant, pKd = -lg Kd.

To compute the binding affinities of the molecules in our data set, we screened each molecule against every protein in our panel. To speed up the calculations, the docking runs were performed against rigid protein structures with no further refinement by molecular dynamics. This simplified approach turned out to be sufficient (see the discussion below). The results of the calculations were organized into screening assays containing pKd values for each protein-small molecule pair (complex) and were stored for further analysis.
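
The relation between pKd and Kd, and one possible way to hold a compound-by-protein table of docking scores, can be sketched as follows. The table layout, the compound and protein names, and the helper functions are all hypothetical; they are not part of the QUANTUM software and only illustrate the bookkeeping.

```python
import math


def pkd_to_kd(pkd):
    """Dissociation constant in mol/L from pKd = -log10(Kd)."""
    return 10.0 ** (-pkd)


def kd_to_pkd(kd_molar):
    """pKd from a dissociation constant given in mol/L."""
    return -math.log10(kd_molar)


# Hypothetical screening table: {compound: {protein: predicted pKd}}
screen = {
    "compound_A": {"protein_1": 6.0, "protein_2": 2.5},
    "compound_B": {"protein_1": 3.5, "protein_2": 4.2},
}


def best_target(table, compound):
    """Protein with the highest predicted pKd for the given compound."""
    scores = table[compound]
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(pkd_to_kd(6.0))                     # Kd for pKd = 6
    print(best_target(screen, "compound_B"))  # strongest predicted binder
```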



D. Data processing and modeling. 

Fitting of the experimental data to the models proposed below was performed using the BFGS algorithm implemented in an in-house program. Selection of the proteins whose affinity correlates with active absorption was performed using the Weka v. 3.5 data mining software [70].

Figure 2: Model of absorption used in the study. The numbers denote: 1 and 5 - drug diffusion from the bulk solution of the donor tank to the enterocytes and from the enterocytes to the bulk solution of the acceptor tank; 2 and 3 - passive and active penetration through the cells; 4 - drug diffusion inside the enterocyte from the apical to the basal membrane; 6 - paracellular absorption of the drug. The drug dissolution stage in the intestinal lumen is omitted.



III. RESULTS 
A. The model 

For the sake of simplicity we considered the absorption of passively and actively transported drugs with negligible efflux and intestinal metabolism. In addition, the model assumes that the drug is highly soluble and stable in the gastrointestinal fluids and that adsorption on the intestinal contents is negligible. In this case absorption from the intestine into the blood can be represented by a two-compartment model (see Fig. 2) consisting of a donor tank (the intestinal lumen) and an acceptor tank (the blood vessel). The intestinal wall can be represented by a single lipid membrane, since no phenomena depend on the drug concentration in the enterocytes. Drug absorption can then be described as a series of steps: diffusion from the bulk solution of the donor tank to the cell layer, penetration across the layer, and diffusion away from the layer into the bulk solution of the acceptor tank. Drug penetration across the lipid layer includes passive diffusion, active transport, and diffusion through pores in the layer, which simulates paracellular absorption.

The effective permeability coefficient, P, through a combination of diffusional barriers and active transport is determined by the following equation [27]:

$$\frac{1}{P} = \frac{1}{P_{\mathrm{UWL}}} + \frac{1}{P_{\mathrm{pass}} + P_{\mathrm{act}} + P_{\mathrm{para}}} \qquad (2)$$

where P_pass, P_act, and P_para are the passive, active, and paracellular permeabilities, and P_UWL is the effective permeability of the unstirred water layers (UWL) in the donor and acceptor tanks:

$$P_{\mathrm{UWL}}^{-1} = P_{\mathrm{UWL},1}^{-1} + P_{\mathrm{UWL},2}^{-1}$$

The values of these permeabilities follow from Fick's law:

$$P_{\mathrm{UWL},i} = \frac{D_{\mathrm{UWL},i}}{h_{\mathrm{UWL},i}} \qquad (3)$$

where D_UWL,i and h_UWL,i are the diffusion coefficient and the effective thickness of the UWL on each side of the cell monolayer. D_UWL,i can be approximated by the diffusion coefficient in water, which varies by less than one order of magnitude for low molecular weight organic compounds [?]. For sufficiently dilute solutions h_UWL,i is approximately constant. Thus P_UWL,i and the effective permeability of the UWLs, P_UWL, can be treated as approximately constant for all low molecular weight organic compounds.

Similarly to P_UWL,i, the permeability for drug diffusion through the membrane, P_pass, can be estimated as:

$$P_{\mathrm{pass}} = \frac{D_{M}\, D}{h_{M}} \qquad (4)$$

where D_M is the diffusion coefficient in the lipid, h_M is the thickness of the membrane, and D is the octanol/water distribution coefficient, i.e. the ratio of the concentrations in the lipid and aqueous phases. Again, as a first approximation D_M can be set to a constant for the various drug-like compounds, so that the proportionality factor between P_pass and D can be considered constant for all low molecular weight organic compounds.

According to [?], the paracellular permeability, P_para, corresponds to size-restricted diffusion within a negative electrostatic force field. Normally it varies within a single order of magnitude [?], and hence its variations can be neglected. In what follows we keep P_para constant throughout. The analysis of the experimental data at our disposal shows that this is indeed a very reasonable assumption.

To build a model of active transport, we first estimated the parameters of passive absorption (P_UWL, P_para, and D_M/h_M) by fitting the observed permeabilities of passively absorbed compounds with Eq. (2) at P_act = 0. These values were then frozen, and the observed permeability values for actively transported compounds were fitted with Eq. (2), in which P_act was given by the proposed model of active transport.

B. Estimation of the passive absorption model parameters.

To estimate the parameters of passive absorption, we fitted the observed permeability values for drugs that, to the best of our knowledge, are passively absorbed with Eq. (2) with no active transport (P_act = 0). Since the approximation contains only three adjustable parameters of passive absorption, a large data set was not needed. We therefore selected passively absorbed compounds with directly measured Caco-2 permeabilities. This was done because the observed FA depends on the experimental conditions and may include the effects of drug instability in the intestinal fluids, intestinal metabolism, and so on. The data on Caco-2 permeability, by contrast, are free of these problems. On the other hand, the tight junctions of a Caco-2 cell monolayer are significantly less permeable than those in the intestine [?, 46].

Figure 3: Model of passive intestinal permeability. The intestinal permeability of passively absorbed compounds (filled and hollow squares) is plotted against LogD. The solid line is the prediction of the model, Eq. (2), with P_act = 0 and P_para = const. The parameter values are given in the text.

Fig. 3 shows the logarithm of the permeability of passively absorbed drugs (both filled and hollow squares) plotted against the logarithm of the distribution coefficient. In accordance with the previously proposed model [14], the intestinal permeability of passively absorbed drugs increases with the distribution coefficient and saturates at both the low and the high ends. The increasing part reflects the growth in membrane permeability with increasing distribution coefficient of the drug. The saturation at the upper limit reflects the diffusional limitation imposed by the UWLs for highly lipophilic drugs. The saturation at low LogD corresponds to the residual permeability through tight junctions. The solid line shows the fit of the experimental data with Eq. (2), where P_act = 0 and P_UWL, P_para, and D_M/h_M are all assumed constant for all the compounds. The best fit was achieved with the following values of the model parameters: P_para = 5.01 × 10^-8 cm·s^-1, P_UWL = 2.88 × 10^-5 cm·s^-1, and D_M/h_M = 3.71 × 10^-5 cm·s^-1. The RMSD was 0.42 log units.
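
The shape of the passive curve can be reproduced with a short Python sketch. It assumes that the unstirred water layers and the membrane act in series, 1/P = 1/P_UWL + 1/(P_pass + P_act + P_para), with P_pass = (D_M/h_M)·D; this form is an assumption consistent with the saturation behaviour described above. The function name is hypothetical, and the constants are the fitted values quoted in the text.

```python
# Sketch of the permeability model with the fitted passive parameters.
P_PARA = 5.01e-8       # cm/s, paracellular permeability
P_UWL = 2.88e-5        # cm/s, effective permeability of the unstirred water layers
DM_OVER_HM = 3.71e-5   # cm/s, proportionality factor between P_pass and D


def effective_permeability(log_d, p_act=0.0):
    """Effective permeability (cm/s) for a given LogD; p_act = 0 is the passive case."""
    p_pass = DM_OVER_HM * 10.0 ** log_d        # membrane diffusion term
    p_membrane = p_pass + p_act + P_PARA       # parallel routes across the cell layer
    return 1.0 / (1.0 / P_UWL + 1.0 / p_membrane)  # UWL in series with the layer


if __name__ == "__main__":
    for log_d in (-6, -3, 0, 3, 6):
        print(log_d, f"{effective_permeability(log_d):.3e}")
```

At very low LogD the result tends to P_para, and at very high LogD it tends to P_UWL, which are the two plateaus described for Fig. 3.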

The determined value of P_para is slightly lower than the experimental estimates, which range up to 10^-6 cm·s^-1 [?], while P_UWL is in good agreement with the values observed at a slow stirring rate (5 × 10^-5 cm·s^-1 at 25 rpm [?]). Using the commonly accepted value of the diffusion coefficient, 10^-5 cm^2·s^-1 [?], we find from Eq. (3) the effective thickness of the UWLs:

$$h_{\mathrm{UWL}} \approx 3 \times 10^{2}\ \mu\mathrm{m}$$

which is in excellent agreement with previously estimated values between 35 and 800 µm [?].

C. The active absorption model.

Using literature data [?], we selected 45 compounds from our database that are reportedly absorbed by active transport and, to the best of our knowledge, are not subject to drug efflux [?]. To enrich the data set, both directly measured Caco-2 permeability values and permeability values calculated from FA were used. If both FA and Caco-2 permeability were available for a given compound, the value of P calculated from the measured FA was employed. This is a reasonable approach, since Caco-2 cells are known to underexpress some important drug transporters [?], and thus Caco-2 permeability data for actively transported compounds are less reliable than FA.

Fig. 4 shows that the permeability of the majority of actively absorbed compounds was higher than predicted by the model of passive absorption, in accordance with the existence of an additional component of permeability. Nevertheless, there were four outliers whose permeability was substantially below the passive permeability curve: fosinopril, diphenhydramine, lobucavir, and cefuroxime axetil. We speculate about possible explanations in the Discussion. For the rest of the compounds the total permeability exceeded the passive component by 0.06 to 3.26 log units and reached the diffusion-limited rate. This means that the intensity of active transport varies over a wide range and may be limited by drug diffusion to the membrane.

Figure 4: Intestinal permeability of actively absorbed compounds. Pink points are observed values. The solid line is the model of passive absorption.

Carrier-mediated absorption (both active and passive) can be described by Michaelis-Menten kinetics [48]:

$$J_{\mathrm{act}} = \sum_{i} \frac{n_{i}}{\tau_{i}}\, \frac{C}{K_{D,i} + C} \qquad (6)$$

where the summation runs over all (types of) transporters; n_i is the number of molecules of the i-th transporter per unit area of membrane; K_D,i is the dissociation constant of the i-th transporter-ligand complex; τ_i is the time required for the transporter to bind and carry one molecule across the membrane; and C is the compound concentration. From Eq. (6) it follows that if C ≪ K_D,i, then J_act → 0 (the compound is passively absorbed). In the opposite case, C ≫ K_D,i,

$$J_{\mathrm{act}} = \sum_{i} \frac{n_{i}}{\tau_{i}}$$

i.e. the compound is actively absorbed, and the active component of the drug flux is independent of the drug concentration and is determined only by the amount of the transporter protein and the time required for a transporter to carry a ligand across the membrane. Thus Eq. (6) is similar to a classifier with threshold value C, which "selects" between the passive and the active transport options. It is therefore natural to build a classifier model to identify proteins that either participate in active transport directly or have a binding site similar to that of a transporter.
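
The two limiting regimes of the carrier-mediated flux can be checked numerically with a small sketch. It assumes the Michaelis-Menten form J_act = Σ_i (n_i/τ_i)·C/(K_D,i + C); the function name and the single transporter used in the example (n = 1, τ = 2, K_D = 10^-3, arbitrary units) are hypothetical.

```python
def active_flux(conc, transporters):
    """Carrier-mediated flux summed over transporters.

    `transporters` is an iterable of (n, tau, kd) tuples: surface density,
    time per transport cycle, and dissociation constant of each transporter.
    """
    return sum(n / tau * conc / (kd + conc) for n, tau, kd in transporters)


if __name__ == "__main__":
    carrier = [(1.0, 2.0, 1e-3)]          # hypothetical single transporter
    print(active_flux(1e-9, carrier))     # C << K_D: flux is negligible
    print(active_flux(1e-3, carrier))     # C = K_D: half of the maximum flux
    print(active_flux(1e3, carrier))      # C >> K_D: flux saturates at n / tau
```

In the saturated regime the flux no longer depends on concentration, so the sharp dependence on the ratio C/K_D is what makes the affinity usable as a classification feature.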

To identify proteins that are related to active absorption or whose active-site force field is similar to that of protein transporters, we used all drugs that are reported to be actively absorbed and have a permeability not less than that predicted by the passive model (41 compounds). In addition, 71 passively absorbed drugs were used. We studied the absorption-Kd relations for these drugs and the proteins in our set and identified a protein that correctly classified 78% of the drugs as actively or passively transported. Fig. 5 shows that practically all compounds with at least some small affinity for the protein are actively absorbed. Using John Platt's sequential minimal optimization algorithm for training a support vector classifier, as implemented in the Weka data mining software, we built a classifier model. The confusion matrix shown in Fig. 5 indicates that only 7% (5 of the 71 passively transported compounds) were mistakenly classified by the model as actively transported. These outliers may in fact be false positives whose affinity for the protein was mistakenly calculated as high.

Figure 5: Relation between the affinity of a compound for human brain hexokinase type I and the intestinal absorption mechanism of the drug. Blue - actively absorbed compounds, red - passively absorbed compounds. X axis - pKd value for the hexokinase. Y axis - the number of passively and actively absorbed compounds. The confusion matrix shows the accuracy of the prediction of active transport with the help of human brain hexokinase.

Figure 6: Permeability prediction for actively absorbed compounds. The predicted permeability for correctly classified compounds is plotted against the experimental values. Both axes are on a logarithmic scale.



Fig. 6 shows the permeability prediction for compounds that were correctly classified as active, obtained using Eqs. (2) and (5) with n/τ = 3.2E-8 M·cm·s^-1 and C = 10E-30 M·cm^-3.

Fig. 5 shows that there are many actively transported compounds among the drugs with zero affinity for the protein. This suggests that there should be other proteins that transport the misclassified drugs. The fact that we failed to find them suggests that our protein set lacks some active-site types that are important for intestinal drug absorption. The confusion matrix shows that only 51% (21 of the 41 actively transported compounds) were classified as such. However, if we consider only the drugs with non-zero affinity for the protein (the 4 right bars of the histogram), 21 of 26 compounds (81%) were actively transported, showing that a compound with affinity for the protein is most probably subject to active transport during intestinal absorption. It is necessary to identify other proteins that classify drugs as actively transported compounds.
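
The percentages quoted for the classifier can be tied together with a small calculation. The sketch below assumes a confusion matrix with 21 actively absorbed compounds classified as active and 5 passively absorbed compounds classified as active; the remaining two cells (20 and 66) are inferred from the stated totals of 41 actively and 71 passively absorbed compounds, which is an assumption rather than a figure given explicitly. The function name is hypothetical.

```python
def classifier_summary(tp, fn, fp, tn):
    """Accuracy, precision, recall and false-positive rate, all in percent."""
    total = tp + fn + fp + tn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "precision": 100.0 * tp / (tp + fp),
        "recall": 100.0 * tp / (tp + fn),
        "false_positive_rate": 100.0 * fp / (fp + tn),
    }


# tp and fp are quoted in the text; fn and tn are inferred from the
# totals of 41 active and 71 passive compounds (an assumption).
summary = classifier_summary(tp=21, fn=41 - 21, fp=5, tn=71 - 5)

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value:.0f}%")
```

Under these assumptions the four values round to 78%, 81%, 51%, and 7%, matching the overall classification rate, the share of active compounds among those with non-zero affinity, the share of active compounds recognized as such, and the share of passive compounds misclassified as active.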

In summary, we started from the premise that the active transport of a drug can be predicted from its affinity for certain proteins, which are not necessarily transporters themselves but have an active-site force field similar to that of transporters. Using this approach and John Platt's sequential minimal optimization algorithm for training a support vector classifier, we succeeded in identifying a protein whose affinity for drugs correlates with the active absorption of these drugs in 81% of cases. This protein can be used to estimate the drug transport mechanism. The high percentage of actively absorbed compounds that the model mistakenly classifies as passively absorbed indicates that there are other proteins that transport these outliers. The fact that we failed to identify them suggests that our current protein set lacks some active-site types that are important for drug absorption, and future steps should be taken to enrich our protein set with such active sites. Nevertheless, the identification of the first protein that discriminates between actively and passively absorbed compounds supports the proposed concept for the prediction of drug absorption.



IV. DISCUSSION 

Attempts have been made to develop a theoretical model of oral absorption and intestinal permeability [?]. However, these models described only passive (trans- and paracellular) drug absorption, while active transport is an important part of it [18]. In this paper we presented a novel approach to the in silico prediction of intestinal permeability for actively transported compounds using binding data for certain proteins. The developed model, Eq. (2), has the standard passive terms and includes an additional active permeability term. When the latter is set to zero, the model reduces to a model of passive absorption. Fig. 3 shows that the reduced (passive absorption) model describes the permeability of passively absorbed compounds fairly well. There were only three apparent outliers (hollow squares): bupropion, bosentan, and remikiren. Provided there were no errors in the experimental data (either permeability or distribution coefficient), we assume that these compounds were subject to either efflux or intestinal metabolism. Fig. 4 shows that the majority of actively transported compounds were above the passive absorption curve, indicating the existence of an additional component of permeability besides the passive one. Nevertheless, there were also outliers whose permeability was substantially below the passive permeability curve: fosinopril, diphenhydramine, lobucavir, and cefuroxime axetil. This may be due to inaccuracy in the experimental data, to efflux of the drug from the cell, or to intestinal metabolism that has not yet been discovered.

The active absorption term in our model is determined by Michaelis-Menten kinetics [?]. If a compound has a high affinity for proteins important for active transport, it is predicted to be actively absorbed; otherwise the active component is low and the compound is largely passively absorbed. The affinity data can be estimated using molecular docking software. Although there are practically no resolved structures of transporters, there may be proteins with a similar force field in the active site whose structures are solved and can be used for docking. Indeed, we identified the first such protein: human brain hexokinase. Of the 26 compounds with a high calculated affinity for the protein, 21 (81%) were actively transported. The protein is not in fact a transporter; however, the correlation suggests that there is a transporter with a similar active-site force field, so that the transporter can be represented by this protein. Fig. 6 shows that the model using the affinities for this protein gives reasonably good predictions for correctly classified compounds.

Further efforts should be made to identify the remaining active-site types important for drug absorption in the intestine. Fig. 5 shows that there are many actively transported compounds among the drugs with zero affinity for the protein. This suggests that there should be other active-site types relevant to active drug transport. The fact that we failed to identify them suggests that our protein set lacks some active-site types that are important for intestinal drug absorption. The work should therefore be continued in this direction.